# load necessary packages
library(tidyverse)
library(mosaic)
library(DataComputing)
library(ggplot2)
What factors are common among MLB Hall of Fame players?
This document is required to indicate where various requirements can be found within your Final Project Report Rmd. You must indicate line numbers as they appear in your final Rmd document accompanying each of the following required tasks. Points will be deducted if line numbers are missing or differ significantly from the submitted Final Rmd document.
Description: (1) Analysis includes at least two different data sources. (2) Primary data source may NOT be loaded from an R package, though supporting data may. (3) Access to all data sources is contained within the analysis. (4) Imported data is inspected at beginning of analysis using one or more R functions: e.g., str, glimpse, head, tail, names, nrow, etc.
HallOfFame <- read_csv("core/HallOfFame.csv")
AllstarFull <- read_csv("core/AllstarFull.csv")
Salaries <- read_csv("core/Salaries.csv")
Batting <- read_csv("core/Batting.csv")
head(HallOfFame)
glimpse(HallOfFame)
head(Salaries)
glimpse(Salaries)
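The inspection step can also confirm which columns the tables share before any join is attempted. The following is a minimal sketch, assuming small made-up stand-ins for `HallOfFame` and `Salaries`; the names `hof_toy` and `sal_toy` and all row values are hypothetical, and the presence of a `yearID` column in the salary table is an assumption about the CSV layout.

```r
library(dplyr)

# Hypothetical stand-ins for the imported tables; values are illustrative only.
hof_toy <- tibble(
  playerID = c("p001", "p002", "p003"),
  yearID   = c(1990, 1995, 2001),
  inducted = c("Y", "N", "Y")
)
sal_toy <- tibble(
  yearID   = c(1985, 1986, 1999),
  teamID   = c("AAA", "AAA", "BBB"),
  playerID = c("p001", "p001", "p003"),
  salary   = c(100000, 150000, 200000)
)

# Columns present in both tables are candidate join keys.
shared_cols <- intersect(names(hof_toy), names(sal_toy))
shared_cols

# Row counts and missing keys are worth checking before any join.
nrow(hof_toy)
nrow(sal_toy)
anyNA(hof_toy$playerID)
```

If a shared column such as `yearID` means something different in each table (ballot year versus salary year), it should be dropped or renamed on one side before joining on `playerID` alone.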
Description: Students need not use every function and method introduced in STAT 184, but clear demonstration of proficiency should include proper use of 5 out of the following 8 topics from class: (+) various data verbs for general data wrangling like filter, mutate, summarise, arrange, group_by, etc. (+) joins for multiple data tables (+) spread & gather to stack/unstack variables (+) regular expressions (+) reduction and/or transformation functions like mean, sum, max, min, n(), rank, pmin, etc. (+) user-defined functions (+) loops and control flow (+) machine learning
InductedP <-
  HallOfFame %>%
  filter(inducted == "Y") %>%
  select(playerID, yearID)
InductedP
Money <-
  Salaries %>%
  select(teamID, playerID, salary)
Money
HallMoney <-
  InductedP %>%
  inner_join(Money, by = "playerID")
HallMoney
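Because a salary table typically holds one row per player per season, an inner join on `playerID` repeats each inducted player once per matching salary row. The sketch below illustrates that behavior and one way to collapse the result to a single row per player with reduction functions. It uses hypothetical toy tables (`inducted_toy`, `money_toy`) in place of `InductedP` and `Money`; the values are made up for illustration.

```r
library(dplyr)

# Hypothetical rows standing in for InductedP and Money.
inducted_toy <- tibble(
  playerID = c("p001", "p003"),
  yearID   = c(1990, 2001)
)
money_toy <- tibble(
  teamID   = c("AAA", "AAA", "BBB", "CCC"),
  playerID = c("p001", "p001", "p003", "p004"),
  salary   = c(100000, 150000, 200000, 50000)
)

# inner_join keeps only players present in both tables and repeats the
# inducted row once per matching salary row.
joined_toy <- inner_join(inducted_toy, money_toy, by = "playerID")
nrow(joined_toy)

# Collapse to one row per player using n(), mean(), and max().
per_player <- joined_toy %>%
  group_by(playerID) %>%
  summarise(
    seasons    = n(),
    meanSalary = mean(salary),
    maxSalary  = max(salary)
  )
per_player
```

Summarising before comparing players avoids giving more weight to players who simply have more salary rows on record.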
WholeLeague <-
  Batting %>%
  filter(G > 20) %>%
  select(playerID, yearID)
WholeLeague
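One possible next step, not taken from the analysis itself, is to use the league-wide table as a comparison group by flagging which players were inducted. This is a sketch under stated assumptions: `league_toy` and `inducted_ids` are hypothetical stand-ins for `WholeLeague` and the `playerID` column of `InductedP`, and the `inHall` column name is invented here.

```r
library(dplyr)

# Hypothetical player-season rows standing in for WholeLeague.
league_toy <- tibble(
  playerID = c("p001", "p001", "p002", "p003"),
  yearID   = c(1980, 1981, 1980, 1992)
)

# Hypothetical IDs standing in for InductedP$playerID.
inducted_ids <- c("p001", "p003")

# Reduce to one row per player, then flag Hall of Fame membership.
league_flagged <- league_toy %>%
  distinct(playerID) %>%
  mutate(inHall = playerID %in% inducted_ids)

# Count inducted versus non-inducted players.
league_flagged %>% count(inHall)
```

Calling `distinct()` first matters because the batting table has one row per player-season, so counting without it would tally seasons rather than players.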